Abstract

We highlight a topological aspect of the graph limit theory. Graphons are limit objects 
for convergent sequences of dense graphs. We introduce the representation of a graphon 
on a unique metric space and we relate the dimension of this metric space to the size of
regularity partitions. We prove that if a graphon has an excluded induced sub-bigraph then 
the underlying metric space is compact and has finite packing dimension. It implies in 
particular that such graphons have regularity partitions of polynomial size. 

1 Introduction 

One can define convergence of a growing graph sequence [H [5J [5] , and construct a limit object to 
such a sequence [TT] in the form of a symmetric measurable function $W : J \times J \to [0, 1]$, where
J is any probability space (one may assume here that J = [0, 1] with the Lebesgue measure, but 
this is not always convenient). We call the pair (J, W) a graphon. 

The goal of this paper is to show that one can introduce also a topology on J (in fact, a 
metric), and that topological properties of this space are related to combinatorial properties of 
the graphon (or of the graphs whose limit it represents). A related metric was introduced in 
[12], and the topology on J was used in [15].

The theory of graph limits is tied to the Regularity Lemma of Szemerédi [T4] [15] in several
ways. In [12] it was shown that the Regularity Lemma is equivalent to the compactness of
the space of graphons in an appropriate metric, and also to a "dimensionality" of particular 
graphons. This paper relates to the latter result. 

The metric in question is simply the $L_1$ metric on functions $W(x, .)$, $x \in J$. This metric
itself can be weird (it may not even be defined on all points of $J$). We show in Section 3 that
every graphon is "equivalent" (technically: weakly isomorphic, see the end of Section 2)
to a graphon (J, W) with special properties: J is a complete separable metric space, and the 
probability measure on J has full support. We call such graphons pure. We also prove that the 
pure version of a graphon is uniquely determined up to changing the function W on a 0-set in 
each row. We define another metric in which J is compact, and characterize the cases when 
the two define the same topology. We prove that several important functions defined on J are 
continuous in this topology, which shows that it is indeed the "right" topology to define on J. 

In Section 4 we show that topological properties of pure graphons are related to their
graph-theoretic properties. Our main result states that if we exclude any bipartite graph from the
graphon, then J must be compact and finite dimensional. 

In [12] it was shown that weak regularity partitions of a graphon $(J, W)$ (which generalize
weak regularity partitions of graphs in a natural way) correspond to covering $J$ with sets of small
diameter. In Section 5 we give a stronger and cleaner version of this result. Combined with the
results in Section 4 we obtain the following fact: If a graph does not contain a fixed bipartite
graph $F$ as an induced sub-bigraph, then it has polynomial size strong regularity partitions (in
the error bound $\varepsilon$).

A motivation for our paper comes from extremal combinatorics. In [13] we study the structure 
of graphons that arise as unique solutions of extremal problems involving the densities of finitely 
many subgraphs (we call such graphons finitely forcible). Such graphons come up naturally 
in extremal graph theory. Quite interestingly, all the examples of finitely forcible graphons 
produced in [T5] have a compact and finite dimensional underlying metric space. The question 
arises whether every extremal problem (involving a finite number of subgraph densities) has a
solution of this type. 

Finally we mention that graph limit theory has a close connection to the theory of dynamical 
systems. Probability spaces with measure preserving actions can often be endowed with a natural
topology in which the action is continuous. The corresponding theory is called topological dy- 
namics. Informally speaking, we can say that the relationship between graphons and topological 
graphons is similar to the relationship between dynamics and topological dynamics. 

2 Preliminaries 

We make a technical but useful distinction between bipartite graphs and bigraphs. A bipartite 
graph is a graph (V, E) whose node set has a partition into two classes such that all edges connect 
nodes in different classes. A bigraph is a triple $(U_1, U_2, E)$ where $U_1$ and $U_2$ are finite sets and
$E \subseteq U_1 \times U_2$. So a bipartite graph becomes a bigraph if we fix a bipartition and specify which
bipartition class is first and second. On the other hand, if $F = (V, E)$ is a graph, then $(V, V, E')$
is an associated bigraph, where $E' = \{(x, y) : xy \in E\}$. This bigraph is obtained from $F$ by a
standard construction of doubling the nodes. 

If $G = (V, E)$ is a graph, then an induced sub-bigraph of $G$ is determined by two subsets
$S, T \subseteq V$, and its edge set consists of those pairs $(x, y) \in S \times T$ for which $xy \in E$ (so this is an
induced subgraph of the bigraph associated with $G$).

Let $J_i = (\Omega_i, \mathcal{A}_i, \pi_i)$ ($i = 1, 2$) be (standard) probability spaces. A measurable function
$W : J_1 \times J_2 \to [0, 1]$ is called a bigraphon. A graphon is a special bigraphon where $J_1 = J_2 = J$
and $W$ is symmetric: $W(x, y) = W(y, x)$ for all $x, y \in J$.

For a fixed probability space $J$, graphons can be considered as elements of the space $L_\infty(J \times J)$.
The norm that is most important in their study is, however, not the $L_\infty$ norm, but the
cut-norm, defined by

$$\|W\|_\square = \sup_{S, T \subseteq J} \left| \int_{S \times T} W(x, y)\,dx\,dy \right|.$$

We will also use the $L_1$ norm

$$\|W\|_1 = \int_{J \times J} |W(x, y)|\,dx\,dy.$$

A graphon $(J, W)$ is called a stepfunction, if there is a partition of $J$ into a finite number of
measurable sets $S_1, \dots, S_n$ so that $W$ is constant on every $S_i \times S_j$. The partition classes will be
called the steps of the stepfunction.

Every graph $F = (V, E)$ can be considered as a graphon, if we consider $V$ as a finite probability
space with the uniform measure, and $E$ as the indicator function of adjacency. We can resolve
the atoms into intervals of length $1/|V|$, to get a graphon $([0, 1], W_F)$ (which is a stepfunction).
More explicitly, we split $[0, 1]$ into $|V|$ equal intervals $L_i$, and define $W_F(x, y) = E(i, j)$ for $x \in L_i$
and $y \in L_j$. This graphon is weakly isomorphic to $(V, E)$ (see below).

In a similar way, every bigraph can be considered as a finite bigraphon, and defines a
bigraphon $([0, 1], [0, 1], W_F)$.
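The passage from a finite graph to its stepfunction, and the cut-norm of a kernel on a finite uniform probability space, can be illustrated by a small sketch. The function names `step_graphon` and `cut_norm` are hypothetical, and the cut-norm is computed by brute force over all pairs of subsets, so this is only meant for very small examples. For a stepfunction the supremum over measurable sets is attained on unions of steps, which is why enumerating subsets of the node set suffices here.

```python
from fractions import Fraction
from itertools import combinations


def step_graphon(adj):
    """Return the stepfunction W_F on [0,1]^2 of the graph with 0-1 adjacency matrix adj."""
    n = len(adj)

    def W(x, y):
        # x lies in the i-th of n equal intervals (the endpoint 1 goes to the last one).
        i = min(int(x * n), n - 1)
        j = min(int(y * n), n - 1)
        return adj[i][j]

    return W


def subsets(n):
    """Yield all subsets of {0, ..., n-1} as tuples."""
    for r in range(n + 1):
        yield from combinations(range(n), r)


def cut_norm(mat):
    """Brute-force cut-norm of an integer matrix viewed as a kernel on a uniform n-point space."""
    n = len(mat)
    best = Fraction(0)
    for S in subsets(n):
        for T in subsets(n):
            total = sum(mat[i][j] for i in S for j in T)
            best = max(best, Fraction(abs(total), n * n))
    return best
```

For a nonnegative kernel the cut-norm is just the total integral; the definition becomes interesting for differences of two kernels, which may have entries of both signs.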
Remark 2.1 We could consider the version of this notion where $J_1 = J_2$ but $W$ is not necessarily
symmetric. Such a structure arises as the limit object of a convergent sequence of directed graphs 
with no parallel edges, and therefore can be called a digraphon. We do not need them in this 
paper. 

Every bigraphon $(J_1, J_2, W)$ can be considered as a linear kernel operator $L_1(J_1) \to L_\infty(J_2)$,
defined by

$$f \mapsto \int W(\cdot, y) f(y)\,dy.$$

Of course, this operator remains well-defined if we increase the subscript in $L_1$ in the domain and
lower the subscript in $L_\infty$ in the range. In the case of a graphon $(J, W)$, it is useful to consider
it as an operator $L_2(J) \to L_2(J)$, since it is then a Hilbert-Schmidt operator, and a rich theory
is applicable. In particular, we know that it has a discrete spectrum.

If $(J_1, J_2, U)$ and $(J_2, J_3, W)$ are two bigraphons, we can define their operator product
$(J_1, J_3, U \circ W)$ by

$$(U \circ W)(x, y) = \int_{J_2} U(x, z) W(z, y)\,dz.$$

(We will write $dz$ instead of $d\pi_2(z)$, where $\pi_2$ is the measure on $J_2$: integrating over $J_2$ means
that we integrate with respect to the probability measure of $J_2$.)

The notion of the density of a graph in a graphon has been introduced in [7]. Here we need
several versions, which unfortunately leads to some messy notation. For a graphon $(J, W)$ and
graph $F = (V, E)$, we associate a variable $x_v \in J$ with every node $v \in V$, and define

$$t(F, W; x) = \prod_{uv \in E(F)} W(x_u, x_v), \qquad t(F, W) = \int_{J^V} t(F, W; x)\,dx.$$

We can think of $t(F, W)$ as "counting subgraphs isomorphic to $F$". We also need the induced
version:

$$t_{\mathrm{ind}}(F, W; x) = \prod_{uv \in E(F)} W(x_u, x_v) \prod_{uv \in E(\overline{F})} (1 - W(x_u, x_v)), \qquad t_{\mathrm{ind}}(F, W) = \int_{J^V} t_{\mathrm{ind}}(F, W; x)\,dx,$$

where $\overline{F}$ denotes the complement of $F$.

For any subset $S \subseteq V$, we define $t_S(F, W; .) : J^S \to \mathbb{R}$ by integrating only over variables
corresponding to $V \setminus S$: If $x'$ and $x''$ denote the restrictions of $x \in J^V$ to $S$ and $V \setminus S$, respectively,
then

$$t_S(F, W; x') = \int_{J^{V \setminus S}} t(F, W; x)\,dx''.$$

Note that $t_\emptyset(F, W) = t(F, W)$ and $t_V(F, W; .) = t(F, W; .)$.
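When $W$ is the graphon of a finite graph $G$ with the uniform measure, the integrals above reduce to averages over all maps from the nodes of $F$ to the nodes of $G$. The following sketch computes $t(F, W)$ and $t_{\mathrm{ind}}(F, W)$ in that special case by exhaustive enumeration. The function names and the encoding of $F$ (an edge list on nodes $0, \dots, k-1$) are assumptions made for illustration, and the running time is exponential in the number of nodes of $F$.

```python
from fractions import Fraction
from itertools import product


def hom_density(F_edges, k, adj):
    """t(F, W_G): fraction of maps V(F) -> V(G) sending every edge of F to an edge of G."""
    n = len(adj)
    hits = sum(
        1
        for phi in product(range(n), repeat=k)
        if all(adj[phi[u]][phi[v]] for u, v in F_edges)
    )
    return Fraction(hits, n ** k)


def induced_density(F_edges, k, adj):
    """t_ind(F, W_G): edges of F must map to edges of G and non-edges of F to non-edges of G."""
    n = len(adj)
    edge_set = {frozenset(e) for e in F_edges}
    non_edges = [
        (u, v)
        for u in range(k)
        for v in range(u + 1, k)
        if frozenset((u, v)) not in edge_set
    ]
    hits = sum(
        1
        for phi in product(range(n), repeat=k)
        if all(adj[phi[u]][phi[v]] for u, v in F_edges)
        and not any(adj[phi[u]][phi[v]] for u, v in non_edges)
    )
    return Fraction(hits, n ** k)
```

Note that the maps are not required to be injective, which matches the integral definition: two variables may take the same value.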

These quantities have obvious analogues for bigraphs and bigraphons. For a bigraphon
$(J_1, J_2, W)$ and bigraph $F = (U_1, U_2, E)$, we introduce variables $x_u \in J_1$ ($u \in U_1$) and $y_v \in J_2$
($v \in U_2$), and define

$$t^b(F, W; x, y) = \prod_{uv \in E(F)} W(x_u, y_v), \qquad t^b(F, W) = \int_{J_1^{U_1}} \int_{J_2^{U_2}} t^b(F, W; x, y)\,dy\,dx.$$

Again, we define an induced version:

$$t^b_{\mathrm{ind}}(F, W; x, y) = \prod_{ij \in E(F)} W(x_i, y_j) \prod_{i \in U_1,\, j \in U_2,\, ij \notin E(F)} (1 - W(x_i, y_j)),$$

$$t^b_{\mathrm{ind}}(F, W) = \int_{J_1^{U_1}} \int_{J_2^{U_2}} t^b_{\mathrm{ind}}(F, W; x, y)\,dy\,dx.$$

Assume that subsets $S_i \subseteq U_i$ are specified. We define the function $t^b_{S_1,S_2}(F, W; .) : J_1^{S_1} \times J_2^{S_2} \to \mathbb{R}$
by

$$t^b_{S_1,S_2}(F, W; x', y') = \int_{J_1^{U_1 \setminus S_1}} \int_{J_2^{U_2 \setminus S_2}} t^b(F, W; x, y)\,dy''\,dx'',$$

where, similarly as above, $x'$ and $x''$ denote the restrictions of $x \in J_1^{U_1}$ to $S_1$ and $U_1 \setminus S_1$,
respectively, and similarly for $y$. We can define $t^b_{\mathrm{ind};S_1,S_2}(F, W)(x', y')$ analogously.

Two graphons $(J, W)$ and $(J', W')$ are weakly isomorphic if for every graph $F$, $t(F, W) = t(F, W')$.
Various characterizations of weak isomorphism were given in [SJ. Every graphon is
weakly isomorphic to a graphon on $[0, 1]$ (with the Lebesgue measure), and also to a (possibly
different) graphon which is twin-free in the sense that $W(x, .)$ and $W(x', .)$ differ on a set of
positive measure for all $x \ne x'$.

3 The topology of graphons
3.1 The neighborhood distance

Let $(J, W)$ be a graphon. We can endow the space $J$ with a distance function by

$$r_W(x, y) = \|W(x, .) - W(y, .)\|_1.$$

This function is defined for almost all pairs $x, y$; we can delete those points from $J$ where
$W(x, .) \notin L_1(J)$ (a set of measure 0), to have $r_W$ defined on all pairs. It is clear that $r_W$ is a
pre-metric (it is symmetric and satisfies the triangle inequality). We call $r_W$ the neighborhood
distance on $W$.

We also define metrics on bigraphons, endowing the spaces $J_1$ and $J_2$ with distance functions
by

$$r_1(x, y) = \|W(x, .) - W(y, .)\|_1 \quad (x, y \in J_1),$$
$$r_2(x, y) = \|W(., x) - W(., y)\|_1 \quad (x, y \in J_2).$$

These functions are defined for almost all pairs $x, y$.
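For a finite graph regarded as a graphon on its node set with the uniform measure, the neighborhood distance of two nodes is the normalized size of the symmetric difference of their neighborhoods. A minimal sketch, with the hypothetical name `neighborhood_distance` and a 0-1 adjacency matrix as input:

```python
from fractions import Fraction


def neighborhood_distance(adj, i, j):
    """r_W(i, j) = (1/n) * sum_k |A[i][k] - A[j][k]| for a graph on n nodes."""
    n = len(adj)
    return Fraction(sum(abs(adj[i][k] - adj[j][k]) for k in range(n)), n)
```

On the path with three nodes the two endpoints are twins and have distance 0, which illustrates why this is only a pre-metric before twins are merged.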
Example 1 Let $S^k$ denote the unit sphere in $\mathbb{R}^{k+1}$, consider the uniform probability measure
on it, and let $W(x, y) = 1$ if $x \cdot y > 0$ and $W(x, y) = 0$ otherwise. Then $(S^k, W)$ is a graphon,
in which the neighborhood distance of two points $a, b \in S^k$ is just their spherical distance
(normalized by dividing by $\pi$). Furthermore, $1 - 2(W \circ W)(x, y)$ is just the spherical distance
of $x$ and $y$, and from here it is easy to see that the similarity distance is within constant factors
of the neighborhood distance.

Example 2 Let $(M, d)$ be a metric space, and let $\pi$ be a Borel probability measure on $M$.
Assume that the diameter of $M$ is at most 1. Then $d$ can be viewed as a graphon on $(M, \pi)$. For
$x, y \in M$, we have

$$r_d(x, y) = \int_M |d(x, z) - d(y, z)|\,d\pi(z) \le \int_M d(x, y)\,d\pi(z) = d(x, y),$$

so the identity map $(M, d) \to (M, r_d)$ is contractive. This implies that if $(M, d)$ is compact,
and/or finite dimensional (in many senses of dimension), then so is $(M, r_d)$. For most "everyday"
metric spaces (like segments, spheres, or balls) $r_d(x, y)$ can be bounded from below by
$\Omega(d(x, y))$, in which case $(M, d)$ and $(M, r_d)$ are homeomorphic.

More generally, if $F : [0, 1] \to [0, 1]$ is a continuous function, then $W(x, y) = F(d(x, y))$
defines a graphon, and the identity map $(M, d) \to (M, r_W)$ is continuous.

Example 3 Finitely forcible graphons, mentioned in the introduction, give interesting examples,
for whose details we refer to [13]. One class is stepfunctions (equivalent to finite weighted graphs),
which were proved to be finitely forcible by Lovász and Sós [lQj: for these, the underlying metric
space is finite. Other examples introduced in [13] provide as underlying topologies an interval,
the Cantor set, and the one-point compactification of $\mathbb{N}$.

3.2 Pure [bi]graphons

A bigraphon $(J_1, J_2, W)$ is pure if $(J_i, r_i)$ is a complete separable metric space and the probability
measure has full support (i.e., every open set has positive measure), for $i = 1, 2$. This definition includes that
$r_i(x, y)$ is defined for all $x, y \in J_i$ and $r_i(x, y) > 0$ if $x \ne y$, i.e., the bigraphon has no "twin
points". We say that a graphon is pure, if the underlying metric probability space is complete,
separable and the probability measure has full support.

Theorem 3.1 Every [bi]graphon is weakly isomorphic to a pure [bi]graphon.

Remark 3.2 It was shown in [2] that every graphon is weakly isomorphic to a graphon on
a standard probability space with no parallel points, which means that for any two points
$x, x' \in J$, $W(x, .)$ and $W(x', .)$ differ on a set of positive measure. Theorem 3.1 can be considered
as a strengthening of this result.

Proof. We give the proof for bigraphons; the case of graphons is similar. We assume that
$J_1$ and $J_2$ are standard probability spaces; this can be achieved similarly as for graphons. Let
$T_1$ be the set of functions $f \in L_1(J_2)$ such that for every $L_1$-neighborhood $U$ of $f$, the set
$\{x \in J_1 : W(x, .) \in U\}$ has positive measure.

Claim 3.3 For almost every point $x \in J_1$, $W(x, .) \in T_1$.

Indeed, it is clear that for almost all $x \in J_1$, $W(x, .) \in L_1(J_2)$. Every function $g \in L_1(J_2) \setminus T_1$
has an open neighborhood $U_g$ in $L_1(J_2)$ such that $\pi_1\{x \in J_1 : W(x, .) \in U_g\} = 0$. Let
$U = \bigcup_{g \notin T_1} U_g$. Since $L_1(J_2)$ is separable, $U$ equals the union of some countable subfamily
$\{U_{g_i} : i \in \mathbb{N}\}$ and thus $\pi_1\{x \in J_1 : W(x, .) \in U\} = 0$. Since if $W(x, .) \notin T_1$ then $W(x, .) \in U$,
this proves the Claim.

Clearly $T_1$ inherits a metric from $L_1(J_2)$ and it is complete and separable in this metric.
The functions $W(x, .)$ are everywhere dense in $T_1$ and have measure 1. It also inherits a
probability measure $\pi_1'$ from $J_1$ through

$$\pi_1'(X) = \pi_1\{x \in J_1 : W(x, .) \in X\}.$$

So $T_1$ is a complete separable metric space with a probability measure on its Borel sets. It also
follows from the definition of $T_1$ that every open set has positive measure.

Define $W' : T_1 \times J_2 \to [0, 1]$ by $W'(f, y) = f(y)$ for $f \in T_1$ and $y \in J_2$. Then we can replace
$J_1$ by $T_1$ and $W$ by $W'$, to get a weakly isomorphic bigraphon. Similarly, we can replace $J_2$ by $T_2$.
□

We say that two graphons $(J, W)$ and $(J', W')$ are isometric if there is an isometric bijection
$\phi : J \to J'$ that is measure preserving, and $W'(\phi(x), \phi(y)) = W(x, y)$ for almost all $x, y \in J$.
The definition for bigraphons is slightly more complicated: two bigraphons $(J_1, J_2, W)$ and
$(J_1', J_2', W')$ are isometric if there are isometric, measure preserving bijections $\phi_1 : J_1 \to J_1'$ and
$\phi_2 : J_2 \to J_2'$ such that $W'(\phi_1(x), \phi_2(y)) = W(x, y)$ for almost all $(x, y) \in J_1 \times J_2$.

Theorem 3.4 If two pure [bi]graphons are weakly isomorphic, then they are isometric.

Proof. We describe the proof for graphons. Theorem 2.1 (a) in [5] says that if two graphons
$(J, W)$ and $(J', W')$ are weakly isomorphic, and they have no twins, then one can delete
0-sets $S \subseteq J$ and $S' \subseteq J'$ such that there is a bijective measure preserving map $\phi : J \setminus S \to J' \setminus S'$
such that $W'(\phi(x), \phi(y)) = W(x, y)$ for almost all $(x, y) \in J \times J$. We may even assume that for
every $x \in J \setminus S$, $W'(\phi(x), \phi(y)) = W(x, y)$ holds for almost all $y$ (and vice versa), since this can
be achieved by deleting further 0-sets. Clearly $\phi$ preserves the metric.

We also know that $J \setminus S$ is dense in $J$ (since $(J, W)$ is pure and so its probability measure
has full support), and so $J$ is the completion of $J \setminus S$ (and similarly for $J'$). Hence $\phi$ extends to
an isometry between $J$ and $J'$, which shows that $(J, W)$ and $(J', W')$ are isometric graphons. □

Remark 3.5 Is purity the ultimate normalization of a graphon? There is still some freedom
left: we can change the value of $W$ on a symmetric subset of $J \times J$ that intersects every fiber
$J \times \{v\}$ in a set of measure 0. We can take the integral of $W$ (which is a measure $\omega$ on $J$),
and then the derivative of $\omega$ wherever this exists. This way we get back $W$ almost everywhere,
and a well defined value for some further points. What is left undefined is the set of "essential
discontinuity" of $W$ (of measure 0). It would be interesting to relate this set to combinatorial
properties of $W$.

3.3 Density functions on pure [bi]graphons 

The following technical Lemma will be very useful in the study of $r_W$ and related distance
functions.

Lemma 3.6 (a) Let $(J, W)$ be a graphon, $F = (V, E)$ a graph, and $S \subseteq V$ an independent set of nodes.
Then the function $t = t_S(F, W; .) : J^S \to \mathbb{R}$ satisfies

$$|t(x) - t(x')| \le |E| \max_{i \in S} r_W(x_i, x'_i).$$

(b) Let $(J_1, J_2, W)$ be a bigraphon, let $F = (U_1, U_2, E)$ be a bigraph, and let $S_i \subseteq U_i$ be such
that no edge connects $S_1$ to $S_2$. Then the function $t = t^b_{S_1,S_2}(F, W; .) : J_1^{S_1} \times J_2^{S_2} \to \mathbb{R}$ satisfies

$$|t(x, y) - t(x', y')| \le |E| \max\Big\{ \max_{i \in S_1} r_1(x_i, x'_i),\ \max_{j \in S_2} r_2(y_j, y'_j) \Big\}.$$

Remark 3.7 (i) It follows that the functions $t$ in (a) and (b) are Lipschitz (and hence continuous).

(ii) In both parts (a) and (b) of the Lemma, the graph $F$ could have multiple edges.

Proof. We describe the proof of (a); the proof of (b) is similar. For each $i \in V \setminus S$, let $x_i = x'_i$
be a variable. Let $E = \{u_1 v_1, \dots, u_m v_m\}$, where we may assume that $v_i \in V \setminus S$. Then

$$t(x) - t(x') = \int_{J^{V \setminus S}} \prod_{i=1}^m W(x_{u_i}, x_{v_i})\,dy - \int_{J^{V \setminus S}} \prod_{i=1}^m W(x'_{u_i}, x'_{v_i})\,dy$$

$$= \sum_{j=1}^m \int_{J^{V \setminus S}} \prod_{i<j} W(x_{u_i}, x_{v_i}) \big( W(x_{u_j}, x_{v_j}) - W(x'_{u_j}, x'_{v_j}) \big) \prod_{i>j} W(x'_{u_i}, x'_{v_i})\,dy,$$

and hence

$$|t(x) - t(x')| \le \sum_{j=1}^m \int_{J^{V \setminus S}} |W(x_{u_j}, x_{v_j}) - W(x'_{u_j}, x'_{v_j})|\,dy.$$

By the assumption that $v_j \in V \setminus S$, we have $x_{v_j} = x'_{v_j}$ for every $j$, and so

$$|t(x) - t(x')| \le \sum_{j=1}^m r_W(x_{u_j}, x'_{u_j}) \le |E| \max_{i \in S} r_W(x_i, x'_i),$$

which proves the assertion. □



Lemma 3.6 has important corollaries for pure graphons, which are closely related to
Lemma 2.8 in [T3]. We do not formulate all versions, just a few that we need.

Corollary 3.8 Let $(J, W)$ be a pure graphon, and let $F$ be a graph and let $S \subseteq V$, where $S$ is
independent. Then $t_S(F, W; x)$ is a continuous function of $x \in J^S$.

Applying this when $F$ is a path of length 2, we get:

Corollary 3.9 For every pure graphon $(J, W)$, $W \circ W$ is a continuous function on $J \times J$.

Another application of Corollary 3.8 gives:



Corollary 3.10 Let $(J, W)$ be a pure graphon, and let $F_1, \dots, F_m$ be graphs whose node set
contains a common set $S$, which is independent in each. Let $T \subseteq S$, and let $a_1, \dots, a_m$ be real
numbers. Let $x \in J^T$, and assume that the equation

$$\sum_{i=1}^m a_i t_S(F_i, W; x, y) = 0 \qquad (1)$$

holds for almost all $y \in J^{S \setminus T}$. Then it holds for all $y \in J^{S \setminus T}$.

Proof. By Corollary 3.8 the left hand side of (1) is a continuous function of $(x, y)$, and so
it remains a continuous function of $y$ if we fix $x$. Hence the set where it is not 0 is an open
subset of $J^{S \setminus T}$. Since the graphon is pure, it follows that this set is either empty or has positive
measure. □

We formulate one similar corollary for bigraphons. 

Corollary 3.11 Let $(J_1, J_2, W)$ be a pure bigraphon, and let $F_1, \dots, F_m$ be bigraphs with the
same bipartition classes $U_1$ and $U_2$. Let $a_1, \dots, a_m$ be real numbers. Assume that the equation

$$\sum_{i=1}^m a_i t^b_{U_1}(F_i, W; x) = 0 \qquad (2)$$

holds for almost all $x \in J_1^{U_1}$. Then it holds for all $x \in J_1^{U_1}$.

3.4 The similarity distance 

It turns out (it was already noted in [12]) that the distance function $r_{W \circ W}$ defined by the
operator square of $W$ is also closely related to combinatorial properties of a graphon. We call
this the similarity distance (for reasons that will become clear later). In explicit terms, we have

$$r_{W \circ W}(a, b) = \int_J \left| \int_J W(a, y) W(y, x)\,dy - \int_J W(b, y) W(y, x)\,dy \right| dx = \int_J \left| \int_J W(x, y) \big( W(y, a) - W(y, b) \big)\,dy \right| dx. \qquad (3)$$

Remark 3.12 Let $X, Y, Z$ be independent uniform random points from $J$, then we can rewrite
the definitions of these distances as

$$r_W(a, b) = \mathbb{E}_X |W(X, a) - W(X, b)|, \qquad (4)$$
$$r_{W \circ W}(a, b) = \mathbb{E}_X \big| \mathbb{E}_Y \big( W(X, Y) (W(Y, a) - W(Y, b)) \big) \big|. \qquad (5)$$

This formulation shows that this distance can be computed with arbitrary precision from a
bounded size sample. We do not go into the details of this.
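As an illustration of the sampling formulation (4) and (5), here is a minimal Monte Carlo sketch for a graphon on $[0, 1]$ with the Lebesgue measure. The names `sample_r_W`, `sample_r_WoW` and the toy graphon `xor_graphon` are hypothetical, and no claim is made about the sample size needed for a given precision; the nested estimator for (5) simply reuses the same sample size $m$ for the inner and outer expectations.

```python
import random


def xor_graphon(x, y):
    """Toy graphon: 1 if exactly one of x, y lies in [0, 1/2), else 0 (a complete bipartite pattern)."""
    return 1 if (x < 0.5) != (y < 0.5) else 0


def sample_r_W(W, a, b, m, rng):
    """Estimate r_W(a, b) = E_X |W(X, a) - W(X, b)| from m uniform samples."""
    total = 0
    for _ in range(m):
        x = rng.random()
        total += abs(W(x, a) - W(x, b))
    return total / m


def sample_r_WoW(W, a, b, m, rng):
    """Estimate r_{WoW}(a, b) = E_X |E_Y W(X, Y)(W(Y, a) - W(Y, b))| by nested sampling."""
    total = 0.0
    for _ in range(m):
        x = rng.random()
        inner = 0
        for _ in range(m):
            y = rng.random()
            inner += W(x, y) * (W(y, a) - W(y, b))
        total += abs(inner / m)
    return total / m
```

For `xor_graphon`, two points on the same side of 1/2 are twins and both estimates are exactly 0, while two points on opposite sides have neighborhood distance exactly 1.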
Lemma 3.13 If $(J, W)$ is a pure graphon, then the similarity distance $r_{W \circ W}$ is a metric.

So $(J, r_{W \circ W})$ is a metric space, and hence Hausdorff. We will show later that it is always
compact.

Proof. The only nontrivial part of this lemma is that $r_{W \circ W}(x, y) = 0$ implies that $x = y$.
The condition $r_{W \circ W}(x, y) = 0$ implies that for almost all $u \in J$ we have
$(W \circ W)(x, u) = (W \circ W)(y, u)$, or more explicitly

$$\int_J (W(x, z) - W(y, z)) W(z, u)\,dz = 0.$$

Using that $(J, W)$ is pure, Corollary 3.11 implies that this holds for every $u \in J$; in particular,
it holds for $u = x$ and $u = y$. Taking the difference, we get that

$$\int_J (W(x, z) - W(y, z)) (W(z, x) - W(z, y))\,dz = 0.$$

By the symmetry of $W$, the integrand equals $(W(x, z) - W(y, z))^2$,
and hence $W(x, z) = W(y, z)$ almost everywhere. Using again that $(J, W)$ is pure, we get that
$x = y$. □

For every $x \in J$, the function $W(x, .)$ is in $L_\infty(J)$, and hence the weak topology of $L_1(J)$
gives a topology on $J$. It is well known that when restricted to $L_\infty(J)$, this topology is the
weak-* topology on $L_\infty(J)$, and hence it is metrizable, and the unit ball of $L_\infty(J)$ is compact in
it (Alaoglu's Theorem). A sequence of points $(x_n)$ converges to $x$ in this topology if and only if

$$\int_A W(x_n, y)\,dy \to \int_A W(x, y)\,dy$$

for every measurable set $A \subseteq J$. We call this the weak topology on $J$. We need this name only
temporarily, since we are going to show that $r_{W \circ W}$ gives a metrization of the weak topology.

Theorem 3.14 For any pure graphon, the metric $r_{W \circ W}$ defines exactly the weak topology.

Proof. First we show that the weak topology is finer than the topology of $(J, r_{W \circ W})$. Suppose
that $x_n \to x$ in the weak topology, and consider

$$r_{W \circ W}(x_n, x) = \int_J \left| \int_J (W(x_n, y) - W(x, y)) W(y, z)\,dy \right| dz.$$

Here the inner integral tends to 0 for every $z$, by the weak convergence $x_n \to x$. Since it also
remains bounded, it follows that the outer integral tends to 0. This implies that $x_n \to x$ in
$(J, r_{W \circ W})$.

From here, the equality of the two topologies follows by general arguments: the weak topology
is compact, and the coarser topology of $r_{W \circ W}$ is Hausdorff, which implies that they are the same.
□



Corollary 3.15 For every pure graphon $(J, W)$, the space $(J, r_{W \circ W})$ is compact.

To compare the topology of $(J, r_W)$ with these, note that for any two points $x, y \in J$, we
have

$$r_{W \circ W}(x, y) \le r_W(x, y), \qquad (6)$$

which implies that the topology of $(J, r_W)$ is finer than the topology of $(J, r_{W \circ W})$.
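Inequality (6) can be checked exactly on small finite graphs, viewed as graphons on their node set with the uniform measure. The sketch below uses hypothetical helper names and exact rational arithmetic; it is an illustration of the inequality on examples, not a proof.

```python
from fractions import Fraction


def r_W(adj, a, b):
    """Neighborhood distance: (1/n) * sum_k |A[a][k] - A[b][k]|."""
    n = len(adj)
    return Fraction(sum(abs(adj[a][k] - adj[b][k]) for k in range(n)), n)


def operator_square(adj):
    """(W o W)(a, x) = (1/n) * sum_y A[a][y] * A[y][x]."""
    n = len(adj)
    return [
        [Fraction(sum(adj[a][y] * adj[y][x] for y in range(n)), n) for x in range(n)]
        for a in range(n)
    ]


def r_WoW(adj, a, b):
    """Similarity distance: the L1 distance of rows a and b of the operator square."""
    n = len(adj)
    sq = operator_square(adj)
    return sum(abs(sq[a][x] - sq[b][x]) for x in range(n)) / n
```

On the path with three nodes, for instance, an endpoint and the middle node have neighborhood distance 1 but similarity distance 4/9.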

3.5 Compact Graphons 

Graphons for which the finer space $(J, r_W)$ is also compact seem to have a special importance
in combinatorics. Let us call such a graphon a compact graphon.

Proposition 3.16 A pure graphon $(J, W)$ is compact if and only if $(J, r_W)$ and $(J, r_{W \circ W})$ define
the same topologies.

Proof. If the topologies $(J, r_W)$ and $(J, r_{W \circ W})$ are the same, then $(J, r_W)$ is compact by
Corollary 3.15. Conversely, if $(J, r_W)$ is compact then, by the argument used before in the proof
of Theorem 3.14, the coarser Hausdorff topology of $(J, r_{W \circ W})$ must be the same. □

Example 4 Let $J = [0, 1]$, $f(y) = \lfloor \log(1/y) \rfloor$, and define

$$W(x, y) = \begin{cases} x_{f(y)}, & \text{if } x > 1/2 \text{ and } y < 1/2, \\ y_{f(x)}, & \text{if } x < 1/2 \text{ and } y > 1/2, \\ 0, & \text{otherwise,} \end{cases}$$

where $x = 0.x_1 x_2 \dots$ and $y = 0.y_1 y_2 \dots$ are the binary expansions of $x$ and $y$, respectively.
Then selecting one point from each interval $[2^{-(k+1)}, 2^{-k}]$, we get an infinite number of points in
$([0, 1], r_W)$ mutually at distance 1/4, so $(J, r_W)$ is not compact, but by Corollary 3.15 $(J, r_{W \circ W})$
is compact. So the two topologies are different.

We conclude this section with an observation relating the topology of J to spectral theory. 

Lemma 3.17 Let $(J, W)$ be a pure graphon. Then every eigenfunction $f \in L_2(J)$ of $W$ as
a kernel operator belonging to a nonzero eigenvalue is continuous in the metric $r_{W \circ W}$ (and
therefore also in $r_W$).

Proof. It suffices to prove that $f$ is continuous in $(J, r_W)$, since we can apply the argument to
the graphon $(J, W \circ W)$, which also has $f$ as an eigenvector.
First, writing $\lambda$ for the eigenvalue, we have

$$|f(x)| = \frac{1}{|\lambda|} \left| \int_J W(x, y) f(y)\,dy \right| \le \frac{1}{|\lambda|} \|f\|_1 \le \frac{1}{|\lambda|} \|f\|_2,$$

and so $f$ is bounded. We know by Corollary 3.9 that $W \circ W$ is continuous in $(J, r_W)$, and hence
so is

$$f(x) = \frac{1}{\lambda^2} \int_J (W \circ W)(x, y) f(y)\,dy.$$

□
4 Thin graphons
4.1 The main theorem

We say that a bigraphon $W$ is thin if there is a bigraph $F$ such that $t^b_{\mathrm{ind}}(F, W) = 0$. Trivially, if
$W$ is thin, then so is its complementary bigraphon $1 - W$.

We call a graphon thin if it is thin as a bigraphon. (Note: for this, it is not enough to require
$t_{\mathrm{ind}}(F, W) = 0$ for some bipartite graph $F$. For example, consider the graphon $U : [0, 1]^2 \to [0, 1]$
defined by $U(x, y) = U(y, x) = 1/2$ if $x \in [0, 1/2]$ and $y \in (1/2, 1]$, and $U(x, y) = 1$ otherwise.
As a bigraphon, this is not thin, but satisfies $t_{\mathrm{ind}}(F, U) = 0$ for every bipartite graph $F$ with at least 3
nodes in one of the classes.)

The (upper) packing dimension of a metric space $(M, d)$ is defined as

$$\limsup_{\varepsilon \to 0} \frac{\log N(\varepsilon)}{\log(1/\varepsilon)},$$

where $N(\varepsilon)$ is the maximum number of points in $M$ mutually at distance at least $\varepsilon$. So this
dimension is finite if and only if there is a $d > 0$ such that every set of points mutually at
distance at least $\varepsilon$ has at most $\varepsilon^{-d}$ elements. It is easy to see that we could use instead of $N(\varepsilon)$
the minimum number of sets of diameter at most $\varepsilon$ covering the space.
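To make the quantity $N(\varepsilon)$ concrete, the following sketch builds an $\varepsilon$-separated subset of a finite point set greedily. The name `greedy_separated` is hypothetical. The greedy set is maximal but not necessarily of maximum size, so its cardinality is only a lower bound on $N(\varepsilon)$; on the other hand, by maximality every point is at distance less than $\varepsilon$ from a chosen point, which is the link between packings and coverings mentioned above.

```python
def greedy_separated(points, dist, eps):
    """Return a maximal subset of points whose members are pairwise at distance >= eps."""
    chosen = []
    for p in points:
        # Keep p only if it is far from everything chosen so far.
        if all(dist(p, q) >= eps for q in chosen):
            chosen.append(p)
    return chosen
```

For the integers 0, ..., 10 on the line with $\varepsilon = 3$, the greedy scan from left to right picks 0, 3, 6, 9.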
Our main goal is to prove: 

Theorem 4.1 If a pure bigraphon $(J_1, J_2, W)$ is thin, then (a) $W(x, y) \in \{0, 1\}$ almost everywhere,
(b) $J_1$, $J_2$ are compact, and (c) $J_1$, $J_2$ have finite packing dimension.

Remark 4.2 The proof will show that if $t^b_{\mathrm{ind}}(F, W) = 0$ for a bigraph $F$ with $k$ nodes, then the
packing dimension of $J_1$ is bounded by $10|F|$.

Before giving the proof, we describe a class of examples, and then recall some facts about 
the Vapnik-Cervonenkis dimension. 

Example 5 Let $V$ be a finite or countable set, $\pi$ a probability measure on $V$, and define
$J_1 = [0, 1]^V$, $J_2 = [0, 1] \times V$, with the power measure on $J_1$ and the product measure on
$J_2$. We define a bigraphon on $J_1 \times J_2$ by

$$W(x, y) = \mathbb{1}(t \le x_i)$$

for $x = (x_i : i \in V)$ and $y = (t, i)$. We can metrize this bigraphon by

$$r_1(x, x') = \sum_{i \in V} \pi(i) |x_i - x'_i|$$

for $x = (x_i : i \in V)$, $x' = (x'_i : i \in V) \in J_1$, and

$$r_2(y, y') = \begin{cases} |t - t'| & \text{if } i = i', \\ t + t' - 2tt' & \text{otherwise,} \end{cases}$$

for $y = (t, i)$, $y' = (t', i') \in J_2$.

If $V$ is finite, then $(J_1, r_1)$ has dimension $|V|$, while $(J_2, r_2)$ has dimension 1, and both are
compact. These facts also follow if we observe that $W$ is thin. Indeed, if $F$ denotes the matching
with $|V| + 1$ edges, then $t^b_{\mathrm{ind}}(F, W) = 0$, since among any $|V| + 1$ points in $J_2$, there are two
points of the form $y = (t, i)$ and $y' = (t', i)$ with $t \le t'$, and then $W(., (t, i)) \ge W(., (t', i))$.

If $V$ is infinite, then $(J_1, r_1)$ is infinite dimensional but compact, while $(J_2, r_2)$ is not compact.

Example 6 Let $J_1 = J_2 = [0, 1]$, and let $W(x, y) = x_{f(y)}$, where $x = 0.x_1 x_2 \dots$ is the binary
expansion of $x$, and $f(y) = \lceil \log(1/y) \rceil$. Then for $x = 0.x_1 x_2 \dots$ and $x' = 0.x'_1 x'_2 \dots$ we have
$r_1(x, x') = \sum_{k \ge 1} 2^{-k} |x_k - x'_k|$, and from here it is easy to see that $([0, 1], r_1)$ is compact.
Furthermore, if $S \subseteq [0, 1]$ is a set of points mutually more than $2^{-n}$ apart, then any two elements of
$S$ must differ in one of their first $n$ digits, and so their number is at most $2^n$. Hence the packing
dimension of $([0, 1], r_1)$ is 1.

On the other hand, selecting a point $y_k \in [2^{-k}, 2^{-(k-1)}]$, we get an infinite number of points
in $([0, 1], r_2)$ mutually at distance 1/2, so this space is not compact and is infinite dimensional.

4.2 Vapnik-Cervonenkis dimension 

For any set $V$ and family of subsets $\mathcal{H} \subseteq 2^V$, a set $S \subseteq V$ is called shattered, if for every $X \subseteq S$
there is a $Y \in \mathcal{H}$ such that $X = Y \cap S$. The Vapnik-Cervonenkis dimension or VC-dimension
$\dim_{VC}(\mathcal{H})$ of a family of sets is the supremum of cardinalities of shattered sets [16]. For us, this
will be always finite.

Let $V$ be a probability space and $\mathcal{H}$ a family of measurable subsets of $V$. A finite subfamily
$\mathcal{H}'$ is qualitatively independent if all the $2^{|\mathcal{H}'|}$ atoms of the set algebra they generate have positive
measure. The dual essential Vapnik-Cervonenkis dimension, or briefly DE-dimension, of $\mathcal{H}$ is the
supremum of all cardinalities of qualitatively independent subfamilies of $\mathcal{H}$.

We recall two basic facts about VC-dimension: 

Lemma 4.3 (Sauer-Shelah Lemma) If a family $\mathcal{H}$ of subsets of an $m$-element set has
VC-dimension $k$, then

$$|\mathcal{H}| \le 1 + m + \dots + \binom{m}{k}.$$
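For small finite families, shattering, the VC-dimension and the Sauer-Shelah bound can all be checked by brute force. The sketch below uses hypothetical function names and represents a family as a list of Python sets over a finite ground set; it is exponential in the size of the ground set and is only meant as an illustration of the definitions.

```python
from itertools import combinations
from math import comb


def is_shattered(family, S):
    """S is shattered if the traces Y & S, Y in family, give all 2^|S| subsets of S."""
    target = frozenset(S)
    traces = {frozenset(Y) & target for Y in family}
    return len(traces) == 2 ** len(target)


def vc_dimension(family, ground):
    """Largest size of a shattered subset of the ground set (0 if no nonempty set is shattered)."""
    ground = list(ground)
    best = 0
    for r in range(1, len(ground) + 1):
        if any(is_shattered(family, S) for S in combinations(ground, r)):
            best = r
        else:
            break  # subsets of shattered sets are shattered, so larger sizes cannot work
    return best


def sauer_shelah_bound(m, k):
    """The bound 1 + m + ... + C(m, k) of the Sauer-Shelah Lemma."""
    return sum(comb(m, i) for i in range(k + 1))
```

The family of initial segments of a 4-element chain has VC-dimension 1 and exactly 5 members, so it meets the bound $1 + m$ with equality.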

For a family $\mathcal{H}$ of sets, we denote by $\tau(\mathcal{H})$ the minimum cardinality of a set meeting every
member of $\mathcal{H}$. The following basic fact about VC-dimension was proved by Komlós, Pach and
Woeginger [9], based on the results of Vapnik and Cervonenkis [16] (we do not state it in its
sharpest form):

Theorem 4.4 Let $J$ be a probability space and $\mathcal{H}$ a family of measurable subsets of $J$ such that
every $A \in \mathcal{H}$ has measure at least $\varepsilon$. Suppose that $\mathcal{H}$ has finite VC-dimension $k$. Then

$$\tau(\mathcal{H}) \le 8k \frac{1}{\varepsilon} \log \frac{1}{\varepsilon}.$$

We need a couple of further facts. For a family $\mathcal{H}$ of sets, let $\mathcal{H}(\Delta)\mathcal{H} = \{A \Delta B : A, B \in \mathcal{H}\}$.

Lemma 4.5 For every family $\mathcal{H}$ of sets, $\dim_{VC}(\mathcal{H}(\Delta)\mathcal{H}) \le 10 \dim_{VC}(\mathcal{H})$.
Proof. Set $k = \dim_{VC}(\mathcal{H})$. Let $S$ be a subset of $V = \cup\mathcal{H}$ with $m$ elements that is shattered by
$\mathcal{H}(\Delta)\mathcal{H}$. Then every $X \subseteq S$ arises as $X = (A \Delta B) \cap S$, where $A, B \in \mathcal{H}$. Since
$(A \Delta B) \cap S = (A \cap S) \Delta (B \cap S)$, the number of different sets of the form $A \cap S$ is at least $2^{m/2}$. By the
Sauer-Shelah Lemma, this implies that

$$2^{m/2} \le 1 + m + \dots + \binom{m}{k},$$

whence $m \le 10k$ follows by standard calculation. □
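The "standard calculation" can be sanity-checked numerically for small $k$. One way to see it analytically (an assumption about the intended argument, not taken from the text) is the estimate $\sum_{i \le k} \binom{m}{i} \le (em/k)^k$ for $m \ge k$, together with $2^{c/2} > ec$ for $c \ge 10$. The sketch below, with the hypothetical name `max_m`, finds the largest $m$ up to a given limit that still satisfies the inequality, comparing squares to stay in exact integer arithmetic.

```python
from math import comb


def max_m(k, limit):
    """Largest m in 1..limit with 2^(m/2) <= sum_{i<=k} C(m, i); returns 0 if there is none."""
    best = 0
    for m in range(1, limit + 1):
        bound = sum(comb(m, i) for i in range(k + 1))
        # 2^(m/2) <= bound  is equivalent to  2^m <= bound^2 for positive integers.
        if 2 ** m <= bound * bound:
            best = m
    return best
```

For $k = 1$ the largest admissible $m$ is 5 and for $k = 2$ it is 13, both well below $10k$.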



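Lemma 4.6 Let $\mathcal{H}$ be a family of measurable sets in a probability space with VC-dimension $k$
such that $\pi(A \Delta B) \ge \varepsilon$ for all distinct $A, B \in \mathcal{H}$. Then $|\mathcal{H}| \le (80k)^k \varepsilon^{-20k}$.

Proof. Consider the family $\mathcal{H}' = \mathcal{H}(\Delta)\mathcal{H}$. Every $A \in \mathcal{H}'$ has $\pi(A) \ge \varepsilon$, and
$\dim_{VC}(\mathcal{H}') \le 10k$ by Lemma 4.5. Hence by Theorem 4.4, we have

$$\tau(\mathcal{H}') \le 80k \frac{1}{\varepsilon} \ln \frac{1}{\varepsilon}.$$

Let $S$ be a set of size $\tau(\mathcal{H}')$ meeting every symmetric difference $A \Delta B$ ($A, B \in \mathcal{H}$). Then
the sets $S \cap A$, $A \in \mathcal{H}$ are all different. By the Sauer-Shelah Lemma, this implies that

$$|\mathcal{H}| \le 1 + |S| + \dots + \binom{|S|}{k} \le |S|^{10k} \le \left( 80k \frac{1}{\varepsilon} \ln \frac{1}{\varepsilon} \right)^{10k}.$$

□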

4.3 VC-dimension and graphons 

Lemma 4.7 Let $(J_1, J_2, W)$ be a pure 0-1 valued bigraphon. Then $W$ is thin if and only if the
DE-dimension of the family $\mathcal{R}_W = \{\mathrm{supp}(W(x, .)) : x \in J_1\}$ is finite.

Proof. Suppose that this dimension is infinite. We claim that $t^b_{\mathrm{ind}}(F, W) > 0$ for every bipartite
graph $F = (U, U', E)$. Let $S \subseteq J_1$ be a set such that the subfamily $\{\mathrm{supp}(W(x, .)) : x \in S\}$ is
qualitatively independent. To each $i \in U$, assign a value $x_i \in S$ bijectively. By Corollary 3.11,
the set of points $y \in J_2$ such that $\mathrm{supp}(W(., y)) \cap S = \{x_i : i \in N(j)\}$ has positive measure for
each $j \in U'$. Hence $t^b_{\mathrm{ind}}(F, W) > 0$.

Conversely, suppose that $k = \dim_{DE}(\mathcal{R}_W)$ is finite. Let $F$ denote the bipartite graph with $k + 1$
nodes in one class $U$ and $2^{k+1}$ nodes in the other class $U'$, in which the nodes in $U'$ have all
different neighborhoods. Then $t^b_{\mathrm{ind}}(F, W) = 0$. □

Remark 4.8 The proof above in fact gives the following quantitative result: $t^b_{\mathrm{ind}}(F, W) = 0$ for
some bigraph $F$ with $k$ nodes in its smaller bipartition class if and only if $\dim_{DE}(\mathcal{R}_W) < k$.



Proof of Theorem 4.1 We may assume that $W$ is pure.

(a) Suppose that the bigraph $F = (U_1, U_2, E)$ satisfies $t^b_{\mathrm{ind}}(F, W) = 0$. Then for almost all
$x \in J_1^{U_1}$, we have $t^b_{U_1,\mathrm{ind}}(F, W; x) = 0$. By Corollary 3.11 it follows that $t^b_{U_1,\mathrm{ind}}(F, W; x) = 0$ for
every $x$. In particular, $t^b_{U_1,\mathrm{ind}}(F, W; x_0, \dots, x_0) = 0$ for all $x_0 \in J_1$. But for this substitution,

$$t^b_{U_1,\mathrm{ind}}(F, W; x_0, \dots, x_0) = \int_{J_2^{U_2}} \prod_{j \in U_2} W(x_0, y_j)^{d_j} (1 - W(x_0, y_j))^{|U_1| - d_j}\,dy,$$

where $d_j$ denotes the degree of node $j$ in $F$,
and so for every $x_0$ we must have $W(x_0, y_0) \in \{0, 1\}$ for almost all $y_0$.

(b) By Theorem 3.16 it suffices to prove that if $W(x_n, .)$, $n = 1, 2, \dots$ weakly converges to $f$, i.e.,

$$\int_S W(x_n, y)\,dy \to \int_S f(y)\,dy \qquad (n \to \infty)$$

for every measurable set $S \subseteq J_2$, then it is also convergent in $L_1$.

Claim 4.9 The weak limit function $f$ is almost everywhere 0-1 valued.

Suppose not, then there is an $\varepsilon > 0$ and a set $Y \subseteq J_2$ with positive measure such that $\varepsilon \le f(y) \le 1 - \varepsilon$ for $y \in Y$. Let $S_n = \mathrm{supp}(W(x_n, .)) \cap Y$. We select, for every $k \ge 1$, $k$ indices $n_1, \dots, n_k$ so that the Boolean algebra generated by $S_{n_1}, \dots, S_{n_k}$ has $2^k$ atoms of positive measure. If we have this for some $k$, then for every atom $A$ of the Boolean algebra

$$\pi_2(A \cap S_n) = \int_A W(x_n, y)\,dy \to \int_A f(y)\,dy \qquad (n \to \infty),$$

and so if $n$ is large enough then

$$\frac{\varepsilon}{2}\,\pi_2(A) \le \pi_2(A \cap S_n) \le \Bigl(1 - \frac{\varepsilon}{2}\Bigr)\pi_2(A).$$

If $n$ is large enough, then this holds for all atoms $A$, and so $S_n$ cuts every previous atom into two sets with positive measure, and we can choose $n_{k+1} = n$.

But this means that the VC-dimension of the supports of the $W(x, .)$ is infinite, contradicting Lemma 4.7. This proves Claim 4.9.

So we know that $f(y) \in \{0, 1\}$ for almost all $y$, and hence

$$\|f - W(x_n, .)\|_1 = \int_{\{f = 1\}} \bigl(1 - W(x_n, y)\bigr)\,dy + \int_{\{f = 0\}} W(x_n, y)\,dy \to 0.$$

Thus $W(x_n, .) \to f$ in $L_1$, which we wanted to prove.

(c) Let $F = (U_1, U_2, E)$ be a bigraph such that $t^b_{ind}(F, W) = 0$, and let $U_i = [k_i]$. We show that the packing dimension of $J_1$ is at most $10k_2$. To this end, we show that if any two elements of a finite set $Z \subseteq J_1$ are at a distance at least $\varepsilon$, then $|Z| \le c(k_2)\varepsilon^{-20k_2}$. Let $\mathcal{H} = \{\mathrm{supp}(W(x, .)) : x \in Z\}$, then

$$\pi_2(X \triangle Y) \ge \varepsilon \qquad (7)$$

for any two distinct sets $X, Y \in \mathcal{H}$.

Let $A$ be the union of all atoms of the set algebra generated by $\mathcal{H}$ that have measure 0. Clearly $A$ itself has measure 0, and hence the family $\mathcal{H}' = \{X \setminus A : X \in \mathcal{H}\}$ still has property (7).

We claim that $\mathcal{H}'$ has VC-dimension less than $k_2$. Indeed, suppose that $J_2 \setminus A$ contains a shattered set $S$ with $|S| = k_2$. To each $j \in U_2$, assign a point $q_j \in S$ bijectively. To each $i \in U_1$, assign a point $p_i \in Z$ such that $q_j \in \mathrm{supp}(W(p_i, .))$ if and only if $ij \in E$. This is possible since $S$ is shattered. Now fixing the $p_i$, for each $j$ there is a subset of $J_2$ of positive measure whose points are contained in exactly the same members of $\mathcal{H}'$ as $q_j$, since $q_j \notin A$. This means that the function $t = t^b_{U_1,ind}(F, W; .) : J_1^{U_1} \to \mathbb{R}$ satisfies $t(p) > 0$. Corollary 3.11 implies that $t(x) > 0$ for a positive fraction of the choices of $x \in J_1^{U_1}$, and hence $t^b_{ind}(F, W) > 0$, a contradiction.

Applying Lemma 4.6 we conclude that $|Z| = |\mathcal{H}'| \le (80k_2)^{10k_2} \varepsilon^{-20k_2}$. □

4.4 Hereditary properties and thin bigraphons 

A graph property $\mathcal{P}$ is a class of finite graphs closed under isomorphism. The property is called hereditary, if whenever $G \in \mathcal{P}$, then every induced subgraph of $G$ is also in $\mathcal{P}$.

Let $\mathcal{P}$ be any graph property. We denote by $\overline{\mathcal{P}}$ its closure, i.e., the class of graphons $(J, W)$ that arise as limits of graph sequences in $\mathcal{P}$. For every graphon $W$, let $\mathcal{I}(W)$ denote the set of those graphs $F$ for which $t_{ind}(F, W) > 0$. Clearly, $\mathcal{I}(W)$ is a hereditary graph property.

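Let $\mathcal{P}$ be a hereditary property of graphs. Then

$$\bigcup_{W \in \overline{\mathcal{P}}} \mathcal{I}(W) \subseteq \mathcal{P}. \qquad (8)$$

Indeed, if $F \notin \mathcal{P}$, then $t_{ind}(F, G) = 0$ for every $G \in \mathcal{P}$, since $\mathcal{P}$ is hereditary. This implies that $t_{ind}(F, W) = 0$ for all $W \in \overline{\mathcal{P}}$, and so $F \notin \mathcal{I}(W)$.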

Equality does not always hold in (8). For example, we can always add a bigraph $G$ and all its induced subgraphs to $\mathcal{P}$ without changing $\overline{\mathcal{P}}$. As a less trivial example, consider all bigraphs with degrees bounded by 10. This property is hereditary, and $\overline{\mathcal{P}}$ consists of a single bigraphon (the identically 0 function).

Proposition 4.10 For a hereditary property $\mathcal{P}$ of graphs equality holds in (8) if and only if for every graph $G \in \mathcal{P}$ and $v \in V(G)$, if we add a new node $v'$ and connect it to all neighbors of $v$, then at least one of the two graphs obtained by joining or not joining $v$ and $v'$ has property $\mathcal{P}$.

Proof. Suppose that this condition holds. Let $F \in \mathcal{P}$ have $n$ nodes, and let $F(k)$ denote a graph in $\mathcal{P}$ obtained from $F$ by a repetition of this operation so that each original node has $k$ copies. Then $t_{ind}(F, F(k)) \ge 1/n^n$. Let $W$ be the limit graphon of some subsequence of the $F(k)$ ($k \to \infty$), then $W \in \overline{\mathcal{P}}$. Furthermore, clearly $t_{ind}(F, W) > 0$, and so $F \in \mathcal{I}(W)$.

Conversely, assume that $F = (V, E) \in \mathcal{I}(W)$ for some $W \in \overline{\mathcal{P}}$, so that $t_{ind}(F, W) > 0$. Let $F'$ and $F''$ be the two graphs obtained from $F$ by doubling a node $v$ ($vv' \notin E(F')$, but $vv' \in E(F'')$), then

$$\int_{J^V} t_{ind}(F, W; x)\,dx > 0$$

implies that there is a positive measure of choices for the values of $x_u$ ($u \in V(F) \setminus v$), for which the set $X$ of the choices of $x_v$ with $t_{ind}(F, W; x) > 0$ has positive measure. Clearly either $W(x, y) < 1$ for a positive measure of choices of $(x, y) \in X \times X$ or this holds for $W(x, y) > 0$. One or the other alternative, say the first one, holds for a positive measure of choices for the values of $x_u$ ($u \in V(F) \setminus v$). But then $t_{ind}(F', W) > 0$. □
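
The bound $t_{ind}(F, F(k)) \ge 1/n^n$ used in the first half of the proof can be illustrated on a small example. The sketch below assumes the usual definition of the induced density for finite graphs (the fraction of all maps $V(F) \to V(G)$ that preserve both adjacency and non-adjacency, with no loops in $G$); the helper names `add_twin` and `t_ind` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def add_twin(adj, v, join):
    """Add a new node v' adjacent to all neighbors of v; join v and v' iff `join`."""
    n = len(adj)
    new = [row[:] + [row[v]] for row in adj]  # each old node sees v' as it sees v
    new.append(adj[v][:] + [0])               # v' copies the row of v, no loop
    new[v][n] = new[n][v] = int(join)
    return new


def t_ind(F_adj, G_adj):
    """Fraction of maps V(F) -> V(G) preserving adjacency and non-adjacency."""
    n, N = len(F_adj), len(G_adj)
    good = sum(
        all(F_adj[i][j] == G_adj[phi[i]][phi[j]]
            for i in range(n) for j in range(i + 1, n))
        for phi in product(range(N), repeat=n)
    )
    return Fraction(good, N ** n)


if __name__ == "__main__":
    path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]   # F = path on 3 nodes
    G = add_twin(path, 0, False)
    G = add_twin(G, 1, True)
    G = add_twin(G, 2, False)                  # every node now has 2 copies
    print(t_ind(path, G), ">=", Fraction(1, 27))
```

The maps sending each node of $F$ to one of its own copies already account for a $1/n^n$ fraction of all maps, which is where the lower bound comes from.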

All of the above notions and simple facts extend to bigraphs and bigraphons trivially. 
Let us turn to thin graphons and bigraphons. The significance of thin bigraphons is supported 
by the following observation: 

Proposition 4.11 Let $\mathcal{P}$ be a hereditary bigraph property that does not contain all bigraphs. Then every bigraphon in its closure is thin.

Proposition 4.11 and Theorem 4.1 imply:

Corollary 4.12 Let $\mathcal{P}$ be a hereditary bigraph property that does not contain all bigraphs. Then for every pure bigraphon $(J_1, J_2, W)$ in its closure, $W$ is 0-1 valued almost everywhere, and $J_1$ and $J_2$ are compact and their dimension is bounded by a finite number depending on $\mathcal{P}$ only.

By this corollary, we can define, for every nontrivial hereditary property of bigraphs, a finite 
dimension. It would be interesting to find further combinatorial properties of this dimension. 
The natural analogue of this corollary for graph properties fails to hold. 

Example 7 Let $\mathcal{P}$ be the property of a graph that it is triangle-free. Then every bipartite graphon is in its closure, but such graphons need not be 0-1 valued, and their topology need not be finite dimensional or compact.

However, if we include the (seemingly) simplest of the conclusions of Corollary 4.12 as a hypothesis, then we can extend it to all graphs. A graph property $\mathcal{P}$ is random-free, if every $W \in \overline{\mathcal{P}}$ is 0-1 valued almost everywhere.

Theorem 4.13 Let $\mathcal{P}$ be a hereditary random-free graph property. Then for every pure graphon $(J, W)$ in its closure, $J$ is compact and finite dimensional.

Before proving this theorem, we need some preparation. 

Lemma 4.14 For a hereditary graph property $\mathcal{P}$, the following are equivalent:

(i) $\mathcal{P}$ is random-free;

(ii) there is a bigraph $F$ such that $t^b(F, W) = 0$ for all $W \in \overline{\mathcal{P}}$;

(iii) there is a bipartite graph $F$ with bipartition $(U_1, U_2)$ such that no graph obtained from $F$ by adding edges within $U_1$ and $U_2$ has property $\mathcal{P}$.

Proof. (i)⇒(iii): Assume that (iii) does not hold, then for every bigraph $F$ there is a graph $\hat{F} \in \mathcal{P}$ and a partition $V(\hat{F}) = \{U_1(\hat{F}), U_2(\hat{F})\}$ such that the bigraph between $U_1(\hat{F})$ and $U_2(\hat{F})$ is isomorphic to $F$. We want to show that $\mathcal{P}$ is not random-free.

Let $(F_1, F_2, \dots)$ be a quasirandom sequence of bigraphs with edge density 1/2, with the same number of nodes in each bipartition class. Consider the graphs $\hat{F}_n$, and let $F'_n$ and $F''_n$ denote the subgraphs of $\hat{F}_n$ induced by $U_1(\hat{F}_n)$ and $U_2(\hat{F}_n)$, respectively. By selecting a subsequence we may assume that the graph sequences $(F'_1, F'_2, \dots)$ and $(F''_1, F''_2, \dots)$ are convergent. By Lemma 4.16 in [5], we can order the nodes of $F'_n$ so that $W_{F'_n}$ converges to a graphon $([0, 1], W')$ in the cut norm $\|.\|_\square$, and similarly, $W_{F''_n}$ converges to a graphon $([0, 1], W'')$ in the cut norm. We order the nodes of $\hat{F}_n$ so that the nodes of $F'_n$ precede the nodes of $F''_n$, and keep the above ordering otherwise. Then trivially $W_{\hat{F}_n}$ converges to the graphon

$$U(x, y) = \begin{cases} W'(2x, 2y) & \text{if } x, y < 1/2, \\ W''(2x - 1, 2y - 1) & \text{if } x, y > 1/2, \\ 1/2 & \text{otherwise.} \end{cases}$$

So $U \in \overline{\mathcal{P}}$ is not 0-1 valued, and so $\mathcal{P}$ is not random-free.

(ii)⇒(i): Suppose that $\mathcal{P}$ is not random-free, and let $(J, W) \in \overline{\mathcal{P}}$ be a graphon that is not 0-1 valued almost everywhere. Then by Theorem 4.1 it is not thin as a bigraphon, which means that for every bigraph $F = (U_1, U_2, E)$, $t^b_{ind}(F, W) > 0$, so (ii) is not satisfied.

(iii)⇒(ii): Consider a bigraph $F = (U_1, U_2, E)$ as in (iii), and consider it as a bipartite graph on $V = U_1 \cup U_2$ (we assume that $U_1 \cap U_2 = \emptyset$). Suppose that it does not satisfy (ii), then there is a graphon $W \in \overline{\mathcal{P}}$ such that $t(F, W; x) > 0$ for a positive measure of choices of the $x \in J^V$. For every such choice, we define a graph $F'$ by connecting those pairs $\{i, j\}$ of nodes of $F$ for which $W(x_i, x_j) > 0$ and either $i, j \in U_1$ or $i, j \in U_2$. The same supergraph $F'$ will occur for a positive measure of choices of the $x_i$, and for this $F'$ we have $t_{ind}(F', W) > 0$, so using (8), we get $F' \in \mathcal{I}(W) \subseteq \mathcal{P}$, a contradiction. □

Proof of Theorem 4.13. By Lemma 4.14 there is a bigraph $F$ such that $t^b(F, W) = 0$ for all $W \in \overline{\mathcal{P}}$. Thus Theorem 4.1 implies the assertion. □



5 Regularity partitions 

5.1 Weak and strong regularity partitions 

The Regularity Lemma of Szemerédi [14] [15], and various weaker and stronger versions of it, are basic tools in the study of large graphs and graphons [12]. Our goal is to show that it is also closely related to the topology of graphons.

Let $(J, W)$ be a graphon and $\mathcal{P}$ a partition of $J$ into measurable sets with positive measure. For $x \in J$, let $S(x)$ denote the partition class containing $x$. Define

$$f_{\mathcal{P}}(x) = \frac{1}{\pi(S(x))} \int_{S(x)} f(u)\,du$$

for a function $f \in L_1(J)$, and

$$W_{\mathcal{P}}(x, y) = \frac{1}{\pi(S(x))\,\pi(S(y))} \int_{S(x) \times S(y)} W(u, v)\,du\,dv.$$

We say that $\mathcal{P}$ is a weak regularity partition with error $\varepsilon$, if $\|W - W_{\mathcal{P}}\|_\square \le \varepsilon$.

We define a Szemerédi partition of a graphon with error $\varepsilon$ as a partition $\mathcal{P} = \{S_1, \dots, S_k\}$ of $J$ into measurable sets such that

$$|\langle W - W_{\mathcal{P}}, H \rangle| \le \varepsilon \qquad (9)$$

for every function $H : J \times J \to [0, 1]$ that is 0-1 valued and whose support is the union of product sets $R_{ij} = R'_{ij} \times R''_{ij} \subseteq S_i \times S_j$ ($i, j \in [k]$). To relate this to the weak partitions, we note that $\|W - W_{\mathcal{P}}\|_\square \le \varepsilon$ can be expressed as (9) for all functions $H$ of the form $\mathbf{1}_{S \times T}$. (The formulation above is not a direct generalization of Szemerédi's definition, but it is closest in our setting; cf. [12].)

A strong regularity partition of a graph was introduced by Alon, Fischer, Krivelevich and M. Szegedy [1]. Here the error is specified by an infinite sequence $\mathcal{E} = (\varepsilon_0, \varepsilon_1, \dots)$ of positive numbers. Again recasting it in our setting, $\mathcal{P}$ is a strong regularity partition with error $\mathcal{E}$ of a graphon $(J, W)$ if there is a graphon $(J, U)$ such that

$$\|W - U\|_1 \le \varepsilon_0 \quad \text{and} \quad \|U - W_{\mathcal{P}}\|_\square \le \varepsilon_{|\mathcal{P}|}.$$

Even stronger would be, of course, to require that $\|W - W_{\mathcal{P}}\|_1 \le \varepsilon$ (equivalently, (9) holds for all measurable functions $H : J \times J \to [-1, 1]$). In this case we call $\mathcal{P}$ an ultra-strong regularity partition with error $\varepsilon$.

The following result is a graphon version of Szemerédi's original Regularity Lemma [14] [15], its "weak" form due to Frieze and Kannan [8], and its strong form due to Alon, Fischer, Krivelevich and M. Szegedy [1]. It was proved for graphons in [12].

Theorem 5.1 Let $(J, W)$ be a graphon on an atomfree probability space. Then

(a) for every $\varepsilon > 0$, $(J, W)$ has a Szemerédi partition with error $\varepsilon$ into no more than $T(\varepsilon)$ classes, where $T(\varepsilon)$ depends only on $\varepsilon$;

(b) for every $\varepsilon > 0$, $(J, W)$ has a weak regularity partition with error $\varepsilon$ into no more than $2^{2/\varepsilon^2}$ classes;

(c) for every sequence $\mathcal{E} = (\varepsilon_0, \varepsilon_1, \dots)$ of positive numbers, $(J, W)$ has a strong regularity partition with error $\mathcal{E}$ into no more than $T(\mathcal{E})$ classes, where $T(\mathcal{E})$ depends only on $\mathcal{E}$.

Remark 5.2 (i) We note that every graphon has an ultra-strong partition with error $\varepsilon$ by standard results in analysis, but the number of classes cannot be bounded uniformly by any function of $\varepsilon$.

(ii) In the usual formulation, partitions in the Regularity Lemma are equitable, i.e., the partition classes are as equal as possible. For graphons on atomless probability spaces, the classes can be required to have the same measure. In fact, it is easy to see that the partitions constructed e.g. in Corollary 5.4 and Theorem 5.8 below can be repartitioned so that the classes will be as equal as possible, the error is at most doubled, and the number of classes is increased by a factor of at most $\lceil 1/\varepsilon \rceil$.

Several other analytic aspects and versions of the Regularity Lemma were proved in [12]. One of these results made a connection between regularity partitions and partitions of $J$ into sets with small diameter in the $r_{W \circ W}$ metric. Here we prove a stronger, cleaner version of that result, and then show how to combine it with our results on thin graphons to get better bounds on the number of partition classes in weak regularity partitions of thin graphons.

5.2 Voronoi cells and regularity partitions 

We show that Voronoi cells in the metric spaces $(J, r_W)$ and $(J, r_{W \circ W})$ are intimately related to different versions of the Regularity Lemma.

Let $(J, d)$ be a metric space and let $\pi$ be a probability measure on its Borel sets. We say that a set $S \subseteq J$ is an average $\varepsilon$-net, if $\int_J d(x, S)\,d\pi(x) \le \varepsilon$.

Let $S \subseteq J$ be a finite set and $s \in S$. The Voronoi cell of $S$ with center $s$ is the set of all points $x \in J$ for which $d(x, s) \le d(x, y)$ for all $y \in S$. Clearly, the Voronoi cells of $S$ cover $J$. (We can break ties arbitrarily to get a partition.)
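
For a finite graph these notions are easy to compute. The sketch below is an illustration under stated assumptions: the graphon is replaced by a symmetric $n \times n$ matrix with the uniform measure on its $n$ points, $r_{W \circ W}(x, y)$ is taken to be the $L_1$ distance between rows $x$ and $y$ of $W \circ W$, and ties are broken in favor of the earliest center. The function names are hypothetical.

```python
def compose(W):
    """(W o W)[x][z] = average over y of W[x][y] * W[y][z]."""
    n = len(W)
    return [[sum(W[x][y] * W[y][z] for y in range(n)) / n for z in range(n)]
            for x in range(n)]


def l1_row_distance(M, x, y):
    """Normalized L1 distance between rows x and y of M."""
    n = len(M)
    return sum(abs(M[x][z] - M[y][z]) for z in range(n)) / n


def voronoi_cells(M, centers):
    """Assign each point to its nearest center; ties go to the earliest center.

    Returns the cells (a dict center -> list of points) and the average
    distance of a point to the set of centers.
    """
    cells = {s: [] for s in centers}
    total = 0.0
    for x in range(len(M)):
        dists = [l1_row_distance(M, x, s) for s in centers]
        best = min(range(len(centers)), key=dists.__getitem__)
        cells[centers[best]].append(x)
        total += dists[best]
    return cells, total / len(M)


if __name__ == "__main__":
    # Two disjoint "blocks" {0, 1} and {2, 3}; W is 1 inside a block, 0 across.
    W = [[1 if (i < 2) == (j < 2) else 0 for j in range(4)] for i in range(4)]
    cells, avg = voronoi_cells(compose(W), centers=[0, 2])
    print(cells, avg)
```

In this terminology, `centers` is an average $\varepsilon$-net exactly when the returned average distance is at most $\varepsilon$.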

Theorem 5.3 Let $(J, W)$ be a graphon, and let $\varepsilon > 0$.

(a) Let $S$ be an average $\varepsilon$-net in the metric space $(J, r_{W \circ W})$. Then the Voronoi cells of $S$ form a weak regularity partition with error at most $8\sqrt{\varepsilon}$.

(b) Let $\mathcal{P} = \{J_1, \dots, J_k\}$ be a weak regularity partition with error $\varepsilon$. Then there are points $v_i \in J_i$ such that the set $S = \{v_1, \dots, v_k\}$ is an average $(4\varepsilon)$-net in the metric space $(J, r_{W \circ W})$.

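Proof. (a) Let $\mathcal{P}$ be the partition into the Voronoi cells of $S$. Let us write $R = W - W_{\mathcal{P}}$. We want to show that $\|R\|_\square \le 8\sqrt{\varepsilon}$. It suffices to show that for any 0-1 valued function $f$,

$$\langle f, Rf \rangle \le 2\sqrt{\varepsilon}. \qquad (10)$$

Let us write $g = f - f_{\mathcal{P}}$, where $f_{\mathcal{P}}(x)$ is obtained by replacing $f(x)$ by the average of $f$ over the class of $\mathcal{P}$ containing $x$. Clearly $\langle f_{\mathcal{P}}, Rf_{\mathcal{P}} \rangle = 0$, and so

$$\langle f, Rf \rangle = \langle g, Rf \rangle + \langle f_{\mathcal{P}}, Rf \rangle = \langle f, Rg \rangle + \langle f_{\mathcal{P}}, Rg \rangle \le 2\|Rg\|_1 \le 2\|Rg\|_2. \qquad (11)$$

For each $x \in J$, let $\varphi(x) \in S$ be the center of the Voronoi cell containing $x$, and define $W'(x, y) = W(x, \varphi(y))$ and similarly $R'(x, y) = R(x, \varphi(y))$. Then using that $(W - R)g = W_{\mathcal{P}}g = 0$, $W - W' = R - R'$ and $R'g = 0$, we get

$$\|Rg\|_2^2 = \langle Rg, Rg \rangle = \langle Wg, (R - R')g \rangle = \langle Wg, (W - W')g \rangle = \langle g, W(W - W')g \rangle$$

$$= \int_{J^3} g(x)\,W(x, y)\bigl(W(y, z) - W(y, \varphi(z))\bigr)\,g(z)\,dy\,dx\,dz \le \int_{J^2} \left| \int_J W(x, y)\bigl(W(y, z) - W(y, \varphi(z))\bigr)\,dy \right| dx\,dz$$

$$= \int_J r_{W \circ W}(z, \varphi(z))\,dz \le \varepsilon,$$

where the last step uses that $S$ is an average $\varepsilon$-net. Together with (11), this gives $\langle f, Rf \rangle \le 2\sqrt{\varepsilon}$.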



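This proves (10).

(b) Suppose that $\mathcal{P}$ is a weak regularity partition with error $\varepsilon$. Let $R = W - W_{\mathcal{P}}$, then we know that $\|R\|_\square \le \varepsilon$.

For every $x \in J$, define

$$F(x) = \int_J \left| \int_J R(x, y)W(y, z)\,dy \right| dz.$$

Then we have

$$\int_J F(x)\,dx = \int_{J^3} s(x, z)R(x, y)W(y, z)\,dx\,dy\,dz,$$

where $s(x, z)$ is the sign of $\int_J R(x, y)W(y, z)\,dy$. For every $z \in J$,

$$\int_{J^2} s(x, z)R(x, y)W(y, z)\,dx\,dy \le 2\|R\|_\square \le 2\varepsilon,$$

and hence

$$\int_J F(x)\,dx \le 2\varepsilon. \qquad (12)$$

Let $x, y \in J$ be two points in the same partition class of $\mathcal{P}$. Then $W_{\mathcal{P}}(x, s) = W_{\mathcal{P}}(y, s)$ for every $s \in J$, and hence

$$r_{W \circ W}(x, y) = \int_J \left| \int_J \bigl(W(x, s) - W(y, s)\bigr)W(s, z)\,ds \right| dz = \int_J \left| \int_J \bigl(R(x, s) - R(y, s)\bigr)W(s, z)\,ds \right| dz$$

$$\le \int_J \left| \int_J R(x, s)W(s, z)\,ds \right| dz + \int_J \left| \int_J R(y, s)W(s, z)\,ds \right| dz = F(x) + F(y).$$

For every set $T \in \mathcal{P}$, let $v_T \in T$ be a point "below average" in the sense that

$$F(v_T) \le \frac{1}{\pi(T)} \int_T F(x)\,dx,$$

and let $S = \{v_T : T \in \mathcal{P}\}$. Then using (12),

$$\mathsf{E}_x\,d(x, S) \le \sum_{T \in \mathcal{P}} \int_T d(x, v_T)\,dx \le \sum_{T \in \mathcal{P}} \int_T \bigl(F(x) + F(v_T)\bigr)\,dx$$

$$\le \int_J F(x)\,dx + \sum_{T \in \mathcal{P}} \pi(T)F(v_T) \le 2\int_J F(x)\,dx \le 4\varepsilon.$$

This proves the Theorem. □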



Theorems 5.3 and 4.1 imply the following Corollary (we prove a stronger result in the next section).

Corollary 5.4 For every bigraph $F = (V, E)$ there is a constant $c_F > 0$ such that if $G$ is a graph not containing $F$ as an induced sub-bigraph, then for every $\varepsilon > 0$, $G$ has a weak regularity partition with error $\varepsilon$ with at most $c_F\,\varepsilon^{-10|V|}$ classes.

Remark 5.5 The conclusion does not remain true if the subgraph we exclude is nonbipartite. Any bipartite graph will then satisfy the condition, and some bipartite graphs are known to need an exponential (in $1/\varepsilon$) number of classes in their weak regularity partitions.

5.3 Edit distance 

We conclude by deriving bounds on the size of the Szemerédi partitions and approximations in $L_1$, using the packing dimension of $(J, r_W)$. In the graph theoretic case, this corresponds to approximation in edit distance.

Lemma 5.6 Let $W$ be a graphon such that $(J, r_W)$ can be covered by $m$ balls of radius $\varepsilon$. Then there is a stepfunction $U$ with $m(1/\varepsilon)^m$ steps such that $\|W - U\|_1 \le 2\varepsilon$.

Remark 5.7 If $W$ is 0-1 valued, then the bound on the number of classes can be improved to $m2^m$.



Proof. Let $\mathcal{P} = \{J_1, J_2, \dots, J_m\}$ be a partition of $J$ into measurable sets such that for every $i$ there is an $x_i \in J$ with $\|W(x_i, .) - W(x, .)\|_1 \le \varepsilon$ for every $x \in J_i$. Let $W'(x, y) = W(x_i, y)$ for $x \in J_i$, then trivially $\|W - W'\|_1 \le \varepsilon$. Let $\mathcal{Q}_i$ be a partition of $J$ into $1/\varepsilon$ measurable classes so that $W(x_i, .)$ varies at most $\varepsilon$ on each class of $\mathcal{Q}_i$. For $x \in J_i$ and $y \in S \in \mathcal{Q}_i$, define

$$U(x, y) = \frac{1}{\pi(S)} \int_S W'(x, z)\,dz.$$

Then clearly $|U(x, y) - W'(x, y)| \le \varepsilon$ for all $x, y \in J$, and hence $\|U - W\|_1 \le \|U - W'\|_1 + \|W - W'\|_1 \le 2\varepsilon$. It is obvious that $U$ is a stepfunction in the partition generated by $\mathcal{P}$ and $\mathcal{Q}_1, \dots, \mathcal{Q}_m$, which has at most $m(1/\varepsilon)^m$ classes. □

We obtain from this lemma: 

Theorem 5.8 Let $W$ be a graphon such that $(J, r_W)$ has packing dimension $d$, then for every $0 < \varepsilon < 1$ it has an ultra-strong partition with error $\varepsilon$ and with at most $\varepsilon^{-O(\varepsilon^{-d})}$ classes.

Proof. Consider a maximal packing in $(J, r_W)$ of balls with radius $\varepsilon/8$; this consists of $m = O(\varepsilon^{-d})$ balls. The balls with the same centers and with radius $\varepsilon/4$ cover $J$, so by Lemma 5.6 there is a stepfunction $U$ with $m(4/\varepsilon)^m \le \varepsilon^{-c\varepsilon^{-d}}$ steps such that $\|W - U\|_1 \le \varepsilon/2$. For the partition $\mathcal{P}$ into the steps of $U$, we have

$$\|W - W_{\mathcal{P}}\|_1 \le 2\|W - U\|_1 \le \varepsilon$$

(the first inequality follows by easy computation). □
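
One way to see the first inequality: since $U$ is constant on the steps of $\mathcal{P}$, we have $U = U_{\mathcal{P}}$, so $W_{\mathcal{P}} - U = (W - U)_{\mathcal{P}}$; averaging does not increase the $L_1$ norm, and the triangle inequality gives $\|W - W_{\mathcal{P}}\|_1 \le \|W - U\|_1 + \|(W - U)_{\mathcal{P}}\|_1 \le 2\|W - U\|_1$. The following sketch checks this numerically on random matrices with the uniform measure on $n$ points; it is an illustration only, and all names are hypothetical.

```python
import random


def step_average(W, labels):
    """W_P: replace each entry by the average of W over its block S(x) x S(y)."""
    n = len(W)
    classes = {}
    for x, c in enumerate(labels):
        classes.setdefault(c, []).append(x)
    avg = {}
    for a, A in classes.items():
        for b, B in classes.items():
            avg[a, b] = sum(W[x][y] for x in A for y in B) / (len(A) * len(B))
    return [[avg[labels[x], labels[y]] for y in range(n)] for x in range(n)]


def l1_distance(W, U):
    """Normalized L1 distance between two n x n matrices."""
    n = len(W)
    return sum(abs(W[x][y] - U[x][y]) for x in range(n) for y in range(n)) / n ** 2


def check(n=12, k=3, seed=0):
    """Return (||W - W_P||_1, 2 ||W - U||_1) for random W and random step U."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in range(n)]
    W = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(x, n):
            W[x][y] = W[y][x] = rng.random()
    # An arbitrary symmetric step function whose steps are the classes of P.
    vals = {(a, b): rng.random() for a in range(k) for b in range(a, k)}
    U = [[vals[min(labels[x], labels[y]), max(labels[x], labels[y])]
          for y in range(n)] for x in range(n)]
    return l1_distance(W, step_average(W, labels)), 2 * l1_distance(W, U)
```

Calling `check(seed=s)` for several seeds returns pairs in which the first entry never exceeds the second, as the argument predicts.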
For thin graphons, we get a stronger bound.

Theorem 5.9 Let $W$ be a thin graphon in which a bigraph $F = (V, E)$ is excluded as an induced sub-bigraph. Then for every $0 < \varepsilon < 1$, it has an ultra-strong partition with error $\varepsilon$ and with $O(\varepsilon^{-10|V|^2})$ classes.

Proof. Theorem 4.1 implies that $W$ is 0-1 valued and it has finite packing dimension at most $10|V|$. Similarly to the proof of Lemma 5.6, let $\mathcal{P} = \{J_1, J_2, \dots, J_m\}$ be a partition of $J$ with $m = O(\varepsilon^{-10|V|})$ into measurable sets such that for every $i$ there is an $x_i \in J$ with $\|W(x_i, .) - W(x, .)\|_1 \le \varepsilon$ for every $x \in J_i$. Let $W'(x, y) = W(x_i, y)$ for $x \in J_i$, then $\|W' - W\|_1 \le \varepsilon$. Let $S_i$ be the support of the function $W(x_i, .)$, and let $\mathcal{A}$ be the set of atoms of the Boolean algebra generated by $\{S_1, S_2, \dots, S_m\}$ with positive measure. For every atom $a \in \mathcal{A}$, let $F_a \subseteq [m]$ denote the index set $\{i \mid a \subseteq S_i\}$ and let $\mathcal{T}$ denote the set system $\{F_a \mid a \in \mathcal{A}\}$. Since $F$ is not an induced sub-bigraph, $\mathcal{T}$ has VC-dimension less than $|V|$, and so by Lemma 4.3 we obtain that $|\mathcal{A}| = O(m^{|V| - 1})$. The joint refinement $\mathcal{P}_2$ of $\mathcal{A}$ and $\mathcal{P}$ is of size at most $O(\varepsilon^{-10|V|^2})$. This completes the proof since $W'$ is a stepfunction with steps in $\mathcal{P}_2$. □

It is easy to see that in the definition of ultra-strong regularity partitions of 0-1 valued graphons, we can replace $W_{\mathcal{P}}$ by a 0-1 valued stepfunction with the same steps, at the cost of doubling the error. Together with Remark 5.2, we can apply this to a (large) finite graph $G$. To state the result, we need a definition. Let $H$ be a simple graph, and let us replace each node $v$ of $H$ by a set $S_v$ of "twin" nodes, where two nodes $x \in S_u$ and $y \in S_v$ ($u \ne v$) are connected if and only if $uv \in E(H)$. For each $u \in V(H)$, either connect all pairs of nodes in $S_u$, or none of them. We call every graph obtained this way a blow-up of $H$.
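
The blow-up construction, together with the number of adjacency changes needed to turn a given graph into a given blow-up, can be sketched as follows. This is a hedged illustration on adjacency matrices; the names `blow_up` and `edge_changes`, and the choice to describe the twin classes by a list of sizes, are assumptions made for the example.

```python
def blow_up(H_adj, sizes, clique):
    """Blow-up of H: node v becomes sizes[v] twins.

    Twins of v are pairwise adjacent iff clique[v]; twins of distinct
    nodes u, v are adjacent iff uv is an edge of H.
    Returns the adjacency matrix and, for each new node, its original node.
    """
    owner = [v for v, s in enumerate(sizes) for _ in range(s)]
    N = len(owner)
    G = [[0] * N for _ in range(N)]
    for a in range(N):
        for b in range(a + 1, N):
            u, v = owner[a], owner[b]
            edge = int(clique[u]) if u == v else H_adj[u][v]
            G[a][b] = G[b][a] = edge
    return G, owner


def edge_changes(G_adj, B_adj):
    """Number of node pairs whose adjacency differs between G and B."""
    n = len(G_adj)
    return sum(G_adj[a][b] != B_adj[a][b]
               for a in range(n) for b in range(a + 1, n))


if __name__ == "__main__":
    edge = [[0, 1], [1, 0]]
    B, owner = blow_up(edge, sizes=[2, 1], clique=[True, False])
    path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    print(B, owner, edge_changes(path, B))
```

Here the blow-up of a single edge with a joined pair of twins is a triangle, which differs from the path on three nodes in one pair.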

Corollary 5.10 For every bigraph $F$ there is a constant $c_F > 0$ such that if $G$ is a graph not containing $F$ as an induced sub-bigraph, then for every $\varepsilon > 0$, we can change $\varepsilon|G|^2$ edges of $G$ so that the resulting graph is a blow-up of a graph with at most $c_F\,\varepsilon^{-10|F|}$ nodes.

Let us say that a graphon $W$ has polynomial $L_1$-complexity if there is a $d > 0$ such that for every $\varepsilon > 0$ there is a stepfunction $W'$ with $O(\varepsilon^{-d})$ steps satisfying $\|W - W'\|_1 \le \varepsilon$. We can define polynomial $\square$-complexity analogously. As we have pointed out, polynomial $\square$-complexity corresponds to the finite dimensionality of the metric space of $W \circ W$. Theorem 5.9 implies that every thin graphon has polynomial $L_1$-complexity.

If $W$ has polynomial complexity, then the structure of $W$ can be described by a polynomial number (in $1/\varepsilon$) of real parameters with an error $\varepsilon$ in the appropriate norm. The set of graphons with polynomial complexity is closed under many natural operations such as operator product, tensor product, etc.

It could be interesting to study other aspects of this complexity notion. We offer a conjecture 
relating our complexity notion to extremal combinatorics. It is supported by examples in [13] . 

Conjecture 5.11 Let $F_1, F_2, \dots, F_n$ be a set of finite graphs, $t_1, t_2, \dots, t_n$ be real numbers in $[0, 1]$ and $S$ be the set of graphons $W$ with $t(F_i, W) = t_i$ for $1 \le i \le n$. Then $S$ is either empty or it contains a graphon of polynomial $L_1$-complexity.